Design of issue tracking/reporting via GitHub #1595

yarikoptic · 2023-05-22T14:55:14Z

To address #863

TODOs:

incorporate feedback from Roni

CodyCBakerPhD · 2023-06-05T18:23:58Z

Overall I think this looks good; one thing I would strongly recommend is a way to ensure the dandiset owners are alerted whenever an issue is reported.

Ideas from the meeting included using the DANDI user ID (which is also the GitHub ID) to auto-subscribe owners to the repo upon creation, or otherwise using the '@' operator pre-populated in the issue template.

bendichter · 2023-09-21T22:11:52Z

I agree, this looks good to me.

yarikoptic · 2023-10-13T18:41:30Z

Having discussed this a little more during ODIN with @rly (also ping @magland), we decided to switch the approach a little:

- not rely on @handles but rather invite authors to become Contributors in the Triage role.
- aim to develop a "bot" which would perform prescribed actions (user management, changes acceptance, etc.); see dandi/dandisets#360, Design a "bot" to assist "managing" of the dandisets by authorized users.
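As a rough illustration of the first item, GitHub's REST API exposes a "repository collaborators" endpoint that accepts a `triage` permission for organization-owned repositories. The sketch below only builds the request, it does not send it; the function name, the `some-author` handle, and the `TOKEN` value are hypothetical placeholders, and the thread does not say that the bot would be implemented this way.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_triage_invite(org, dandiset, username, token):
    """Build (but do not send) a request inviting `username` with the Triage role.

    Assumes one repository per dandiset under `org`, as discussed in the thread.
    """
    url = f"{GITHUB_API}/repos/{org}/{dandiset}/collaborators/{username}"
    body = json.dumps({"permission": "triage"}).encode()
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical values; urllib.request.urlopen(req) would actually send the invite.
req = build_triage_invite("dandisets", "000026", "some-author", "TOKEN")
print(req.get_method(), req.full_url)
```

Keeping the request construction separate from sending makes it easy to review or test what a bot would do before it touches any real repository.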

Please chime in with what you think; maybe there is some complication you can see which I have missed.

bendichter · 2023-10-16T12:50:37Z

@yarikoptic I agree with the proposed changes and with the justification of those changes.

waxlamp

Overall, I think this could be a good way to handle dandiset-specific issues. I have the following concerns:

- How do future deployments of dandi-archive deal with this? There won't always be a magical DataLad org filled with one repository per dandiset. Perhaps a better way to do this is simply to use a single issue tracker for all issues (tagged with dandiset ID, perhaps); then, the configuration for a dandi archive deploy can point to a single issue tracker to fulfill this functionality.
- What are the alternative approaches here? It seems a little bit odd for neuroscientists to use a code-specific issue tracker to report problems with datasets. Is there a different service out there that might integrate more cleanly with the archive?

bendichter · 2023-10-24T01:08:58Z

@waxlamp Thanks for the thoughts.

> There won't always be a magical DataLad org filled with one repository per dandiset.

Why not? Why do you consider this magic?

> Perhaps a better way to do this is simply to use a single issue tracker for all issues

We could do this but we'd need to sort out a few features.

- How to associate issues with specific dandisets (could be done with an issue form, I guess).
- How to notify the dandiset owners when an issue is filed in relation to that dandiset.
- An easy way for users to filter issues by dandiset.

> Is there a different service out there that might integrate more cleanly with the archive?

There certainly are other ticketing systems out there, but GitHub is free and more familiar to neuroscientists than any other platform. It also has a number of useful features:

- We could set up GitHub Actions to automate processing of issues.
- We can assign issues to projects to prioritize them.
- GitHub has a REST API we can use to integrate issues with the main website if we want to.

yarikoptic · 2023-10-24T14:49:15Z

Valid concerns, thanks @waxlamp and thanks @bendichter for the points too!

After all, we also have staging DANDI, and it would need to have "its own". I think it might be possible to plan for both, i.e. template out the URLs for issues, given the placeholders {gh_org} (e.g. dandisets), {gh_repo} (e.g. empty '' or 'dandisets-staging/' if a single repo), and {dandiset}:

- main instance - per dandiset repo like we have: gh_org = dandisets, gh_repo = ''
  - list of issues: https://github.com/{gh_org}/{gh_repo}/{dandiset}/issues, e.g. https://github.com/dandisets/000026/issues
  - file a new issue: https://github.com/{gh_org}/{gh_repo}/{dandiset}/issues/new, e.g. https://github.com/dandisets/000026/issues/new
- staging - a single repo, e.g. https://github.com/dandi/dandisets-staging : gh_org = dandi, gh_repo = dandisets-staging
  - list of issues: https://github.com/{gh_org}/{gh_repo}/issues?q=is%3Aissue+is%3Aopen+%5B{dandiset}%5D, e.g. https://github.com/dandi/dandisets-staging/issues?q=is%3Aissue+is%3Aopen+%5B600789%5D
  - file a new issue: https://github.com/{gh_org}/{gh_repo}/issues/new?title=[{dandiset}]:, e.g. https://github.com/dandi/dandisets-staging/issues/new?title=[600789]:%20
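A minimal sketch of how the two URL layouts above could be templated, assuming an empty `gh_repo` means "one repository per dandiset" and a non-empty one means "a single shared repository". The function name `issue_urls` is hypothetical, and the sketch branches on `gh_repo` rather than concatenating the placeholders literally, which sidesteps the question of where the slash between `{gh_repo}` and `{dandiset}` belongs.

```python
from urllib.parse import quote, urlencode


def issue_urls(gh_org, dandiset, gh_repo=""):
    """Return (list_url, new_issue_url) for a dandiset.

    Empty gh_repo: one repository per dandiset under gh_org.
    Non-empty gh_repo: a single shared repository, with issues
    tagged by a "[<dandiset>]" prefix in the title.
    """
    if not gh_repo:
        base = f"https://github.com/{gh_org}/{dandiset}"
        return f"{base}/issues", f"{base}/issues/new"
    base = f"https://github.com/{gh_org}/{gh_repo}"
    query = urlencode({"q": f"is:issue is:open [{dandiset}]"})
    title = quote(f"[{dandiset}]: ", safe="[]:")
    return f"{base}/issues?{query}", f"{base}/issues/new?title={title}"


# Main instance: per-dandiset repositories
print(issue_urls("dandisets", "000026"))
# Staging: a single shared repository
print(issue_urls("dandi", "600789", gh_repo="dandisets-staging"))
```

With the example values from this comment, the function reproduces the four example URLs listed above.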

But such flexibility might complicate future "tighter" integration, e.g. querying for a number of open issues etc to include that information on DLP.
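To make the "number of open issues" integration concrete, here is a hedged sketch built on GitHub's issue search endpoint (`GET /search/issues`), whose JSON response carries a `total_count` field. The helper names are hypothetical, the network call itself is left out, and it is an assumption that the bracketed title prefix matches in the search API the same way it does in the web UI query above. Note how the query has to differ between the two setups, which is the complication mentioned here.

```python
import json
from urllib.parse import urlencode


def open_issues_search_url(gh_org, dandiset, gh_repo=""):
    """Build a search URL whose response reports the number of open issues."""
    if gh_repo:
        # Single shared repository: narrow down by the "[<dandiset>]" tag.
        q = f"repo:{gh_org}/{gh_repo} is:issue is:open [{dandiset}]"
    else:
        # One repository per dandiset: every open issue belongs to it.
        q = f"repo:{gh_org}/{dandiset} is:issue is:open"
    return "https://api.github.com/search/issues?" + urlencode({"q": q, "per_page": 1})


def open_issue_count(response_body):
    """Extract the count from the JSON body returned by the search endpoint."""
    return json.loads(response_body)["total_count"]


print(open_issues_search_url("dandisets", "000026"))
# A made-up response body, only to show the parsing step:
print(open_issue_count('{"total_count": 3, "incomplete_results": false, "items": []}'))
```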

Probably a more lightweight alternative to aiming for 2 setups would be to concentrate on the proposed one here, and

- Set up a dandisets-staging organization.
- Add minimal functionality of establishing a new dandiset repository under a specified organization to dandi-archive itself (now it is part of the dataladification script)
  - it would require specification/storing of a GitHub auth token for each instance.
  - it would not be called if no configuration to do so is set up (e.g. in the unit tests)
- Adjust the dataladification script to "tolerate" a repository that already exists but is empty (I think we actually do not need to do anything since we have existing="reconfigure": https://github.com/dandi/dandisets/blob/HEAD/tools/backups2datalad/adataset.py#L374 ).
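The backend piece could look roughly like the following sketch, which assumes the organization name and token come from per-instance configuration and that repository creation is skipped entirely when either is missing. It uses GitHub's "create an organization repository" endpoint (`POST /orgs/{org}/repos`); the function name and the way configuration is passed in are hypothetical, not existing dandi-archive code.

```python
import json
import urllib.request


def build_create_repo_request(dandiset, gh_org=None, token=None):
    """Build (but do not send) the request creating a repository for a dandiset.

    Returns None when no organization or token is configured, so that
    deployments (or unit tests) without this setup never call GitHub.
    """
    if not gh_org or not token:
        return None
    return urllib.request.Request(
        f"https://api.github.com/orgs/{gh_org}/repos",
        data=json.dumps({"name": dandiset}).encode(),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Not configured: nothing to do.
print(build_create_repo_request("000026"))
# Configured (hypothetical values): urllib.request.urlopen(req) would create the repo.
req = build_create_repo_request("000026", gh_org="dandisets-staging", token="TOKEN")
print(req.get_method(), req.full_url)
```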

@waxlamp WDYT about adding functionality to dandi-archive to call out to the GitHub API to create a new repository upon a request to create a new dandiset? Or would you prefer to go the "2 possible setups" way? (That would require only a one-time manual setup but more parametrization for each instance.)

magland · 2023-10-24T15:59:37Z

So if there are thousands of dandisets, there would be thousands of gh repos under an organization maintained by dandi? Seems like it could become difficult to manage. What about having the author of the dandiset optionally contribute and maintain their own gh repo for this?

yarikoptic · 2023-10-25T20:53:10Z

> So if there are thousands of dandisets, there would be thousands of gh repos under an organization maintained by dandi?

yes. Glorious the time when we get 1000s of nice non-empty datasets! we will celebrate!

> Seems like it could become difficult to manage.

With the right tools it should be doable, and it is not unprecedented:

- we already have them all under https://github.com/dandisets/ and https://github.com/dandizarrs
- OpenNeuro also uses https://github.com/OpenNeuroDatasets/ and https://github.com/OpenNeuroDerivatives/

> What about having the author of the dandiset optionally contribute and maintain their own gh repo for this?

An interesting idea. In principle we can already allow people to link GitHub repos within dandiset metadata.

A quick grep already shows a good number of mentions of GitHub within dataset metadata:

dandi@drogon:/mnt/backup/dandi/dandisets$ grep github */dandiset.yaml | grep -v dandi/schema
000008/dandiset.yaml:  url: https://github.com/berenslab/mini-atlas
000016/dandiset.yaml:  NWB files is provided at https://github.com/ttngu207/najafi-2018-nwb/blob/master/notebooks/Najafi-2018_example.ipynb.'
000026/dandiset.yaml:  url: https://biccn.github.io/Quarterly_Submission_Receipts/000026-dashboard.html
000027/dandiset.yaml:  ATM contains only a few files from http://github.com/dandi-datasets/nwb_test_data
000035/dandiset.yaml:  url: https://github.com/berenslab/mini-atlas
000037/dandiset.yaml:  url: https://colleenjg.github.io/
000037/dandiset.yaml:  url: https://github.com/jeromelecoq/allen_openscope_metadata/tree/master/projects/credit_assignement
000037/dandiset.yaml:  url: https://github.com/colleenjg/OpenScope_CA_Analysis
000037/dandiset.yaml:  url: https://github.com/colleenjg/cred_assign_stimuli
000060/dandiset.yaml:  here: \n https://github.com/arsenyf/FinkelsteinFontolan_2021NN"
000064/dandiset.yaml:description: This is data produced by the Soltesz Lab NeuroH5 software (https://github.com/iraikov/neuroh5).
000064/dandiset.yaml:  The data has been converted to NWB using the ndx-simulation-output extension (https://github.com/catalystneuro/ndx-simulation-output).
000064/dandiset.yaml:  url: https://github.com/iraikov/neuroh5
000108/dandiset.yaml:  url: https://biccn.github.io/Quarterly_Submission_Receipts/000108-dashboard.html
000122/dandiset.yaml:  can be found at https://github.com/rob-luke/experiment-fNIRS-tapping.
000122/dandiset.yaml:  url: https://github.com/rob-luke/BIDS-NIRS-Tapping
000122/dandiset.yaml:  url: https://github.com/rob-luke/experiment-fNIRS-tapping
000127/dandiset.yaml:  of the Neural Latents Benchmark: https://neurallatents.github.io.'
000128/dandiset.yaml:  Latents Benchmark: https://neurallatents.github.io.'
000129/dandiset.yaml:  Neural Latents Benchmark: https://neurallatents.github.io.'
000130/dandiset.yaml:  of the Neural Latents Benchmark: https://neurallatents.github.io.'
000138/dandiset.yaml:  Benchmark: https://neurallatents.github.io.'
000139/dandiset.yaml:  Benchmark: https://neurallatents.github.io.'
000140/dandiset.yaml:  Benchmark: https://neurallatents.github.io.'
000165/dandiset.yaml:  url: https://github.com/emilyasterjones/interneurons_modulate_drive
000168/dandiset.yaml:  repository: github
000168/dandiset.yaml:  url: https://github.com/rozmar/jGCaMP8_ground_truth_dataset
000207/dandiset.yaml:  Example code on how to plot this data can be found at https://github.com/rutishauserlab/cogboundary-zheng
000207/dandiset.yaml:  repository: github
000207/dandiset.yaml:  url: https://github.com/rutishauserlab/cogboundary-zheng
000221/dandiset.yaml:  Example codes to plot data is at https://github.com/hidehikoinagaki/InagakiAndChenEtAl2022'
000222/dandiset.yaml:  Code and README can be found at https://github.com/JustinOHare/ICR_2022.git'
000231/dandiset.yaml:  repository: github
000231/dandiset.yaml:  url: https://github.com/cxrodgers/NwbDandiData2022
000402/dandiset.yaml:  url: https://github.com/datajoint/microns_phase3_nda
000404/dandiset.yaml:  https://github.com/pkhanna104/bmi_dynamics_code and archived at https://zenodo.org/record/8006653"
000405/dandiset.yaml:  table.\n\nPre-print DOI: \nhttps://doi.org/10.1101/2022.12.15.520660\n\nGithub:\nhttps://github.com/alexgonzl/TMA\n"
000462/dandiset.yaml:  Scripts used for analysis can be found on https://github.com/seethakris/HPCrewardpaper'
000465/dandiset.yaml:  [Electrode mapping information & Basic analysis codes] Github: https://ytchoe.github.io/'
000469/dandiset.yaml:  provided: \nhttps://github.com/rutishauserlab/workingmem-release-NWB\n"
000469/dandiset.yaml:  url: https://github.com/rutishauserlab/workingmem-release-NWB
000473/dandiset.yaml:  url: https://github.com/PierreLeMerre/Esr1_NPX_code
000483/dandiset.yaml:  url: https://github.com/ucsb-goard-lab/Neurotar-HD-Experiments
000540/dandiset.yaml:  on https://rhythm-n-rodents.github.io/software/.
000554/dandiset.yaml:  [Electrode mapping information & Basic analysis codes] Github: https://ytchoe.github.io/'
000557/dandiset.yaml:  https://ytchoe.github.io/"
000574/dandiset.yaml:  url: https://github.com/janhohenheim/usz-neuro-conversion
000574/dandiset.yaml:  url: https://github.com/janhohenheim/nwb-example
000575/dandiset.yaml:  url: https://github.com/janhohenheim/usz-neuro-conversion
000575/dandiset.yaml:  url: https://github.com/janhohenheim/nwb-example
000576/dandiset.yaml:  url: https://github.com/janhohenheim/usz-neuro-conversion
000576/dandiset.yaml:  url: https://github.com/janhohenheim/nwb-example
000579/dandiset.yaml:  notebook for a tutorial to read and extract information from these NWB files: https://github.com/sytseng/Notebook_for_Dandiset_000579\n\n-
000579/dandiset.yaml:  NWB extension code for custom lab meta data (required for reading NWB files): https://github.com/sytseng/ndx-harvey-swac
000579/dandiset.yaml:  \n\n- Code and tutorials for fitting GLM to neural activity in Tensorflow 2: https://github.com/sytseng/GLM_Tensorflow_2"
000579/dandiset.yaml:  url: https://github.com/sytseng/Notebook_for_Dandiset_000579
000579/dandiset.yaml:  url: https://github.com/sytseng/ndx-harvey-swac
000579/dandiset.yaml:  url: https://github.com/sytseng/GLM_Tensorflow_2
000582/dandiset.yaml:    url: https://github.com/dandi/dandi-cli
000618/dandiset.yaml:  This dataset was prepared using the following script: https://github.com/flatironinstitute/spikeforest/blob/main/devel/dandiset/prepare_dandiset.py
000623/dandiset.yaml:  Git Link: https://github.com/rutishauserlab/bmovie-release-NWB-BIDS'
000625/dandiset.yaml:  email: [email protected]
000630/dandiset.yaml:  Analysis code and extracted features available at https://github.com/AllenInstitute/patchseq_human_L1.
000630/dandiset.yaml:  Feature extraction package available at https://github.com/AllenInstitute/ipfx.'
000678/dandiset.yaml:  url: https://github.com/sjara/uobrainflex/tree/master/hulsey2023

Might be worthwhile working out a complete use case example.

FWIW, cons I see:

- it might end up being "more difficult" due to the various ways authors might decide to label (etc.) issues, in particular if we were to do some "overall overhaul" (e.g. consistent labeling etc.)
- it would be up to authors to do that, and they likely would not, so users would end up without a dandiset-specific issues board linked from the dandiset landing page

bendichter · 2023-11-06T15:29:18Z

Does GitHub limit the number of repos in an organization?

Edit: Answer: no. "All organizations can own an unlimited number of public and private repositories." (source)

yarikoptic · 2023-11-22T23:17:03Z

Ok, it seems there are no further questions/concerns. I have added a section on the minimal development to be done on the dandi-archive backend, and made it all conditional on having a GitHub organization provisioned.

@jwodder please also have a look/provide feedback.

waxlamp

I am tentatively ok with this design (we will learn more about its suitability/viability as we go), but I'd like to also consult with others in Kitware who may know of alternatives we haven't covered.

yarikoptic · 2023-12-11T15:20:07Z

@bendichter mentioned https://github.com/giscus/giscus, which interfaces with discussions (not issues); that also opens up the ecosystem of other attempts at similar platforms (https://github.com/gitalk/gitalk, https://github.com/utterance/utterances, etc.), but they all seem to be "dead", with no development/support.

bendichter · 2023-12-11T15:23:42Z

Discussion may actually be a slightly better option. It psychologically opens up the discussion to messages like questions or explanations rather than just problems. An issue can easily be created from a discussion as well in the GitHub web interface.

yarikoptic · 2024-01-25T17:21:03Z

doc/design/dandiset-issues-tracking.md

+- not "integrated" within dandiarchive.org as issues (meta)data would not be contained within DANDI.
+  - A possible mitigation: I guess we could collect mirror issues/comments etc from GitHub internally in the archive. There are tools which could even be used to help. E.g. @yarikoptic has experience with using https://github.com/MichaelMure/git-bug to sync all issues from GitHub locally to collect all contributors to the project. E.g. [this script](https://github.com/nipy/heudiconv-joss-paper/blob/main/authors/tools/make-summaries#L92) processes a JSON dump of all issues from `git bug`  mirror.
+
+[bot]: https://github.com/dandi/dandisets/issues/360


Suggested change

[bot]: https://github.com/dandi/dandisets/issues/360

[bot]: https://github.com/dandi/dandisets/issues/360

### Miscellaneous other possible future integrations with GitHub

- 2i2c uses github teams for resources management on the hub (ref: 1.1 within https://github.com/dandi/dandi-hub/pull/90/files#diff-82655098d9fb488babf6a5ce10d3d5f6a98d17b2f69de5ca28315e54e020bdf9R29)

waxlamp · 2024-02-03T16:06:45Z

#1313 mentioned utterances, which builds a comment stream over github issues. That is something to consider as well.

Design of issue tracking/reporting via GitHub

fe61ddc

yarikoptic mentioned this pull request May 22, 2023

Add UI link to file issues about a dandiset #863

Closed

yarikoptic commented Jun 5, 2023

yarikoptic mentioned this pull request Jun 30, 2023

[Feature Request]: Add 'Questions' button to redirect to issues page of dandiset #1643

Closed

CodyCBakerPhD reviewed Jul 7, 2023

bendichter and others added 2 commits July 17, 2023 22:57

Update dandiset-issues-tracking.md

b72efb7

Add section on Changes to dandi-archive web ui

f1385b7

Co-authored-by: Cody Baker <[email protected]>

yarikoptic mentioned this pull request Sep 21, 2023

dandiset issues #1684

Closed

magland mentioned this pull request Oct 13, 2023

request: optional link to github repo for discussion and issues #1705

Closed

Modify the approach -- we will invite owners to Triage issues and use…

a16e5bc

… bot

yarikoptic mentioned this pull request Oct 23, 2023

Design a "bot" to assist "managing" of the dandisets by authorized users dandi/dandisets#360

Open

2 tasks

waxlamp reviewed Oct 23, 2023

Add needed changes for dandi-archive backend which would take care ab…

467fcc3

…out creating repos if org is provisioned

mvandenburgh self-requested a review December 4, 2023 15:20

waxlamp self-assigned this Dec 4, 2023

waxlamp reviewed Dec 8, 2023

Use a specific organization to store issues

d8adab5

That is, each instance will specify which org to use, rather than relying on out-of-band use of DataLad to create the repos. Co-authored-by: Yaroslav Halchenko <[email protected]>

yarikoptic commented Jan 25, 2024

waxlamp assigned yarikoptic and unassigned waxlamp Feb 26, 2024

yarikoptic mentioned this pull request May 3, 2024

dandi-derived instance of DANDI? dandi/dandi-infrastructure#172

Open

yarikoptic marked this pull request as draft August 6, 2024 19:01
